Estimating Average-Case Learning Curves Using Bayesian, Statistical Physics and VC Dimension Methods
Authors
Michael Kearns, AT&T Bell Laboratories, Murray Hill, New Jersey
Robert Schapire, AT&T Bell Laboratories, Murray Hill, New Jersey

Abstract
In this paper we investigate an average-case model of concept learning, and give results that place the popular statistical physics and VC dimension theories of learning curve behavior in a common framework.
Similar Papers
Theorem 3 Let P Be a Nondegenerate Prior on F and Q Be Any Distribution on F. Let
In this paper we study a Bayesian or average-case model of concept learning with a twofold goal: to provide more precise characterizations of learning curve (sample complexity) behavior that depend on properties of both the prior distribution over concepts and the sequence of instances seen by the learner, and to smoothly unite in a common framework the popular statistical physics and VC dimension theories of learning curve behavior.
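The VC dimension theory referenced in the abstract gives worst-case sample-complexity guarantees of the familiar form m = O((1/ε)(d ln(1/ε) + ln(1/δ))). As a hedged illustration (the constants below are illustrative ones from the standard textbook-style bound, not the paper's own results), the bound can be sketched as:

```python
import math

def vc_sample_bound(d, epsilon, delta):
    """Illustrative worst-case VC sample-complexity bound:
    roughly (1/eps) * (d * ln(1/eps) + ln(1/delta)) labeled examples
    suffice to guarantee generalization error <= eps with probability
    >= 1 - delta. Constants here are for illustration only; they are
    not the tightest known, nor the average-case bounds of this paper.
    """
    return math.ceil(
        (4.0 / epsilon) * (d * math.log(2.0 / epsilon) + math.log(2.0 / delta))
    )

# Example: hypothesis class of VC dimension 10,
# target error 0.1, failure probability 0.05.
m = vc_sample_bound(10, 0.1, 0.05)
print(m)
```

Note how the bound depends only on the VC dimension d and not on the prior over concepts or the instance sequence; the average-case analysis above is motivated precisely by this gap between worst-case bounds and observed learning curves.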
On the VC-Dimension of the Choquet Integral
The idea of using the Choquet integral as an aggregation operator in machine learning has gained increasing attention in recent years, and a number of corresponding methods have already been proposed. Complementing these contributions from a more theoretical perspective, this paper addresses the following question: What is the VC dimension of the (discrete) Choquet integral when being used as a...
Learning curves from a modified VC-formalism: a case study
In this paper we present a case study of a 1-dimensional higher order neuron using a statistical approach to learning theory which incorporates some information on the distribution on the sample space and can be viewed as a modification of the Vapnik-Chervonenkis formalism (VC-formalism). We concentrate on learning curves defined as averages of the worst generalization error of binary hypothesi...
Required sample size for learning sparse Bayesian networks with many variables
Learning joint probability distributions on n random variables requires exponential sample size in the generic case. Here we consider the case that a temporal (or causal) order of the variables is known and that the (unknown) graph of causal dependencies has bounded in-degree ∆. Then the joint measure is uniquely determined by the probabilities of all (2∆ + 1)-tuples. Upper bounds on the sample...
MLP Can Provably Generalize Much Better than VC-bounds Indicate
Results of a study of the worst-case learning curves for a particular class of probability distributions on the input space of an MLP with hard-threshold hidden units are presented. It is shown in particular that in the thermodynamic limit, for scaling by the number of connections to the first hidden layer, although the true learning curve behaves as ∼ α^(-1) for α ≫ 1, its VC-dimension-based bound is tri...
Journal:
Volume, Issue:
Pages: -
Publication date: 1991